{
 "cells": [
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "954e61ed-b3b3-4445-af90-19046f6c1da2",
   "metadata": {
    "tags": []
   },
   "source": [
    "# Sub Question Query Engine\n",
    "In this tutorial, we showcase how to use a **sub question query engine** to tackle the problem of answering a complex query using multiple data sources.  \n",
    "It first breaks down the complex query into sub questions for each relevant data source,\n",
    "then gather all the intermediate reponses and synthesizes a final response."
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "75ac0675-cde6-49ae-bfb3-4c43e6c4a718",
   "metadata": {},
   "source": [
    "### Preparation"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "c3675b87-4d08-4f59-a971-48bf67c66ad0",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "from llama_index import VectorStoreIndex, SimpleDirectoryReader\n",
    "from llama_index.tools import QueryEngineTool, ToolMetadata\n",
    "from llama_index.query_engine import SubQuestionQueryEngine\n",
    "from llama_index.callbacks import CallbackManager, LlamaDebugHandler\n",
    "from llama_index import ServiceContext"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "22b8b519-dd95-4dfb-9cdc-cb0004852036",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "# load data\n",
    "pg_essay = SimpleDirectoryReader(input_dir=\"../data/paul_graham/\").load_data()\n",
    "\n",
    "# build index and query engine\n",
    "query_engine = VectorStoreIndex.from_documents(pg_essay).as_query_engine()"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "484cb516-6692-4d07-9bd8-93941909b459",
   "metadata": {},
   "source": [
    "### Setup sub question query engine"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "dbff0cbd-9a1a-4acc-9870-03d05640125c",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "# setup base query engine as tool\n",
    "query_engine_tools = [\n",
    "    QueryEngineTool(\n",
    "        query_engine=query_engine,\n",
    "        metadata=ToolMetadata(\n",
    "            name=\"pg_essay\", description=\"Paul Graham essay on What I Worked On\"\n",
    "        ),\n",
    "    )\n",
    "]\n",
    "\n",
    "# Using the LlamaDebugHandler to print the trace of the sub questions\n",
    "# captured by the SUB_QUESTION callback event type\n",
    "llama_debug = LlamaDebugHandler(print_trace_on_end=True)\n",
    "callback_manager = CallbackManager([llama_debug])\n",
    "service_context = ServiceContext.from_defaults(callback_manager=callback_manager)\n",
    "query_engine = SubQuestionQueryEngine.from_defaults(\n",
    "    query_engine_tools=query_engine_tools,\n",
    "    service_context=service_context,\n",
    "    use_async=False,\n",
    ")"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "eeea8e15-78ab-4f72-a380-1a76bb5d5737",
   "metadata": {
    "tags": []
   },
   "source": [
    "### Run queries"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "a576ce5e-a6d1-470d-be51-84fbb31b4aa2",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Generated 2 sub questions.\n",
      "\u001b[36;1m\u001b[1;3m[pg_essay] Q: What did Paul Graham work on before YC?\n",
      "\u001b[0m\u001b[33;1m\u001b[1;3m[pg_essay] Q: What did Paul Graham work on after YC?\n",
      "\u001b[0m\u001b[33;1m\u001b[1;3m[pg_essay] A: \n",
      "After YC, Paul Graham worked on painting, writing essays, and Lisp.\n",
      "\u001b[0m\u001b[36;1m\u001b[1;3m[pg_essay] A: \n",
      "Before YC, Paul Graham worked on hacking, writing essays, and Arc, a programming language. He also wrote Hacker News, originally called Startup News, to test the new version of Arc.\n",
      "\u001b[0m**********\n",
      "Trace: query\n",
      "    |_llm ->  13.602417 seconds\n",
      "    |_sub_questions ->  4.580089 seconds\n",
      "    |_synthesize ->  6.2397 seconds\n",
      "      |_llm ->  6.235836 seconds\n",
      "**********\n"
     ]
    }
   ],
   "source": [
    "response = await query_engine.aquery(\n",
    "    \"How was Paul Grahams life different before and after YC?\"\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "222d16a7-089a-4de0-b551-fcd785f3eb26",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "Paul Graham's life was significantly different before and after YC. Before YC, he was focused on hacking, writing essays, and developing a programming language. After YC, he shifted his focus to painting, writing essays, and Lisp.\n"
     ]
    }
   ],
   "source": [
    "print(response)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "fc38e65a",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Sub Question 0: What did Paul Graham work on before YC?\n",
      "Answer: Before YC, Paul Graham worked on hacking, writing essays, and Arc, a programming language. He also wrote Hacker News, originally called Startup News, to test the new version of Arc.\n",
      "====================================\n",
      "Sub Question 1: What did Paul Graham work on after YC?\n",
      "Answer: After YC, Paul Graham worked on painting, writing essays, and Lisp.\n",
      "====================================\n"
     ]
    }
   ],
   "source": [
    "# iterate through sub_question items captured in SUB_QUESTIONS event\n",
    "from llama_index.callbacks.schema import CBEventType, EventPayload\n",
    "\n",
    "for start_event, end_event in llama_debug.get_event_pairs(CBEventType.SUB_QUESTIONS):\n",
    "    for i, qa_pair in enumerate(end_event.payload[EventPayload.SUB_QUESTIONS]):\n",
    "        print(\"Sub Question \" + str(i) + \": \" + qa_pair.sub_q.sub_question.strip())\n",
    "        print(\"Answer: \" + qa_pair.answer.strip())\n",
    "        print(\"====================================\")"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.4"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
